In [1]:
import datacube.api
from pprint import pprint
By default, the API uses the database connection configured in the config file.
Details on setting up the config file and database can be found here: http://agdc-v2.readthedocs.org/en/develop/db_setup.html
In [2]:
dc = datacube.api.API()
In [3]:
dc.list_fields()
Out[3]:
The product and platform fields look interesting. Find out more about them:
In [4]:
dc.list_field_values('product')
Out[4]:
In [5]:
dc.list_field_values('platform')
Out[5]:
There are several API calls that describe and provide data in different ways:
get_descriptor() - provides a description of the data for a given query.
get_data() - provides the data as an xarray.DataArray for each variable. It is usually called based on information returned by the get_descriptor call.
get_data_array() - returns an n-dimensional xarray.DataArray object, with the variables stacked along the dimension labelled variables.
get_dataset() - returns an xarray.Dataset object containing an xarray.DataArray for each variable.
In [6]:
query = {
    'product': 'gamma0',
    'platform': ['ALOS_2', 'SENTINEL_1A'],
}
descriptor = dc.get_descriptor(query, include_storage_units=False)
pprint(descriptor)
The query can be restricted to a particular range along a dimension.
For spatial queries, the dimension names should be used. By default, the range values are interpreted as WGS84 coordinates.
In [7]:
query = {
    'product': 'gamma0',
    'platform': ['ALOS_2', 'SENTINEL_1A'],
    'dimensions': {
        'x': {
            'range': (146.0, 147.0),
        },
        'y': {
            'range': (-42.0, -41.0),
        },
        'time': {
            'range': ((2015, 1, 1), (2017, 1, 2)),
        }
    }
}
pprint(dc.get_descriptor(query, include_storage_units=False))
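The nested query dictionaries used here follow a regular shape, so they can be assembled by a small helper. The sketch below is illustrative only: build_query is a hypothetical function, not part of the datacube API, and it simply reproduces the dictionary layout and (year, month, day) date tuples shown in the cell above.

```python
from datetime import date


def build_query(product, platforms, x_range, y_range, start, end):
    """Assemble a query dict in the shape used by the cells above.

    Hypothetical helper, not part of datacube.
    `start` and `end` are datetime.date objects.
    """
    return {
        'product': product,
        'platform': list(platforms),
        'dimensions': {
            'x': {'range': tuple(x_range)},
            'y': {'range': tuple(y_range)},
            'time': {
                # The notebook writes dates as (year, month, day) tuples.
                'range': (
                    (start.year, start.month, start.day),
                    (end.year, end.month, end.day),
                ),
            },
        },
    }


query = build_query(
    'gamma0', ['ALOS_2', 'SENTINEL_1A'],
    x_range=(146.0, 147.0), y_range=(-42.0, -41.0),
    start=date(2015, 1, 1), end=date(2017, 1, 2),
)
```

The resulting dictionary has the same structure as the hand-written query and could be passed to dc.get_descriptor in the same way.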
A coordinate reference system can be provided for the spatial dimensions, either as an EPSG code or a WKT description:
In [8]:
query = {
    'product': 'gamma0',
    'platform': ['ALOS_2', 'SENTINEL_1A'],
    'dimensions': {
        'x': {
            'range': (1187756.25, 1284918.75),
            'crs': 'EPSG:3577',
        },
        'y': {
            'range': (-4666481.25, -4548968.75),
            'crs': 'EPSG:3577',
        },
        'time': {
            'range': ((2016, 1, 1), (2017, 1, 1)),
        }
    }
}
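Because the crs entry is repeated on each spatial dimension, it can be convenient to attach it in one step. The following is a minimal sketch, assuming the dictionary layout shown above; with_crs is a hypothetical helper, not a datacube function. It only labels the ranges with a CRS and does not reproject them, so the range values must already be expressed in that CRS.

```python
import copy


def with_crs(query, crs, spatial_dims=('x', 'y')):
    """Return a copy of `query` with `crs` set on each spatial dimension.

    Hypothetical helper, not part of datacube. It does not reproject
    the range values; they must already be in the given CRS.
    """
    result = copy.deepcopy(query)
    for name in spatial_dims:
        if name in result.get('dimensions', {}):
            result['dimensions'][name]['crs'] = crs
    return result


base = {
    'product': 'gamma0',
    'dimensions': {
        'x': {'range': (1187756.25, 1284918.75)},
        'y': {'range': (-4666481.25, -4548968.75)},
        'time': {'range': ((2016, 1, 1), (2017, 1, 1))},
    },
}
albers = with_crs(base, 'EPSG:3577')
```

The original dictionary is left unchanged, and the time dimension is not given a crs entry.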
get_data() retrieves the data, usually as a subset, based on the information provided by the get_descriptor call.
The query takes a similar form to the get_descriptor call, with the addition of a variables parameter. If it is not specified, all variables are returned.
The query also accepts an array_range parameter on a dimension, which selects a subset based on array indices rather than labelled coordinates.
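The difference between label-based and index-based selection can be illustrated without the datacube at all. The sketch below uses plain Python lists with made-up coordinate values; both functions are hypothetical, and the half-open bounds used for the index-based case are an assumption borrowed from Python slicing, not a statement about how array_range treats its upper bound.

```python
# Coordinate labels along one dimension (made-up values for illustration).
x_labels = [146.0, 146.25, 146.5, 146.75, 147.0]


def select_by_range(labels, low, high):
    """Label-based: keep positions whose coordinate lies in [low, high]."""
    return [i for i, value in enumerate(labels) if low <= value <= high]


def select_by_array_range(labels, start, stop):
    """Index-based: keep positions start..stop-1, ignoring the labels.

    Assumes half-open bounds, as in Python slicing.
    """
    return list(range(len(labels)))[start:stop]


by_label = select_by_range(x_labels, 146.2, 146.8)
by_index = select_by_array_range(x_labels, 0, 1)
```

Here by_label depends on the coordinate values, while by_index depends only on position in the array.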
In [10]:
query = {
    'product': 'gamma0',
    'platform': 'ALOS_2',
    'variables': ['hh_gamma0', 'hv_gamma0'],
    'dimensions': {
        'x': {
            'range': (146, 147),
            'array_range': (0, 1),
        },
        'y': {
            'range': (-41, -42),
            'array_range': (0, 1),
        },
        'time': {
            'range': ((2016, 1, 1), (2017, 1, 1))
        }
    }
}
data = dc.get_data(query)
data.keys()
Out[10]:
In [12]:
alos2 = dc.get_data_array(product='gamma0', platform='ALOS_2', y=(-41,-42), x=(146,147))
s1a = dc.get_data_array(product='gamma0', platform='SENTINEL_1A', y=(-41,-42), x=(146,147))
In [15]:
dc.get_dataset(product='gamma0', platform='SENTINEL_1A', y=(-41,-42), x=(146,147))
Out[15]: